Overview

Dataset statistics

Number of variables: 16
Number of observations: 45211
Missing cells: 0
Missing cells (%): 0.0%
Duplicate rows: 0
Duplicate rows (%): 0.0%
Total size in memory: 5.5 MiB
Average record size in memory: 128.0 B
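The two memory figures are consistent with each other, which can be checked with a little arithmetic. The sketch below assumes every one of the 16 columns is stored as a 64-bit (8-byte) value; the report does not state the dtypes, so `BYTES_PER_VALUE` is an assumption.

```python
# Figures taken from the dataset statistics above.
N_ROWS = 45211          # number of observations
N_COLS = 16             # number of variables
BYTES_PER_VALUE = 8     # assumption: every column uses a 64-bit dtype

# One record is one value per column.
record_bytes = N_COLS * BYTES_PER_VALUE

# Total size in MiB (1 MiB = 2**20 bytes).
total_mib = N_ROWS * record_bytes / 2**20

print(f"record size: {record_bytes} B")
print(f"total size:  {total_mib:.1f} MiB")
```

Under that assumption the script reproduces the reported 128 B per record and roughly 5.5 MiB in total.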

Variable types

Numeric: 8
Categorical: 8

Alerts

month is highly overall correlated with housing and 3 other fields [High correlation]
pdays is highly overall correlated with month and 1 other field [High correlation]
previous is highly overall correlated with pdays and 1 other field [High correlation]
housing is highly overall correlated with month [High correlation]
contact is highly overall correlated with month [High correlation]
poutcome is highly overall correlated with pdays [High correlation]
day is highly overall correlated with month [High correlation]
previous is highly skewed (γ1 = 41.84645447) [Skewed]
balance has 3514 (7.8%) zeros [Zeros]
previous has 36954 (81.7%) zeros [Zeros]
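The Skewed and Zeros alerts rest on two simple quantities. The following is a minimal sketch of both, using the plain moment-based skewness g1 = m3 / m2^1.5; the estimator the report actually uses may apply a small-sample correction, so treat the function as illustrative rather than as the tool's implementation. The `toy` list is made up for the example.

```python
def skewness(values):
    """Moment-based sample skewness: third central moment over variance**1.5."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def zero_share(values):
    """Fraction of entries that are exactly zero."""
    return sum(1 for v in values if v == 0) / len(values)


# Toy column: mostly zeros with one large value, loosely like `previous`.
toy = [0] * 8 + [1, 10]
print("skewness:", skewness(toy))
print("zero share:", zero_share(toy))

# The percentages in the alerts follow directly from the reported counts.
print(round(100 * 36954 / 45211, 1), round(100 * 3514 / 45211, 1))
```

A long right tail, as in the toy column, gives a positive skewness, while a symmetric column gives zero.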

Reproduction

Analysis started: 2022-12-05 13:36:57.728840
Analysis finished: 2022-12-05 13:37:14.391912
Duration: 16.66 seconds
Software version: pandas-profiling v3.5.0
Download configuration: config.json
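A report of this kind could be regenerated along the following lines. This is a minimal sketch, assuming pandas and pandas-profiling 3.5.0 are installed and that the data is available as a CSV file. The file names `bank.csv` and `report.html` and the report title are hypothetical, since the report does not name its input.

```python
def build_report(csv_path="bank.csv", out_path="report.html"):
    """Profile a CSV file and write an HTML report (paths are hypothetical)."""
    # Imported lazily so that defining the function needs no third-party packages.
    import pandas as pd
    from pandas_profiling import ProfileReport

    df = pd.read_csv(csv_path)
    report = ProfileReport(df, title="Dataset profile")
    report.to_file(out_path)
    return out_path
```

Calling `build_report()` would then read the CSV, run the analysis, and write the HTML file.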

Variables

age
Real number (ℝ)

Distinct: 77
Distinct (%): 0.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.93621
Minimum: 18
Maximum: 95
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 353.3 KiB

Quantile statistics

Minimum: 18
5-th percentile: 27
Q1: 33
median: 39
Q3: 48
95-th percentile: 59
Maximum: 95
Range: 77
Interquartile range (IQR): 15

Descriptive statistics

Standard deviation: 10.618762
Coefficient of variation (CV): 0.25939778
Kurtosis: 0.31957038
Mean: 40.93621
Median Absolute Deviation (MAD): 7
Skewness: 0.68481793
Sum: 1850767
Variance: 112.75811
Monotonicity: Not monotonic
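Several of the figures for age are derived from others, which allows a quick consistency check. The sketch below recomputes them from the reported mean, standard deviation and quantiles; the variable names are illustrative only.

```python
# Values copied from the age statistics above.
mean, std = 40.93621, 10.618762
minimum, q1, q3, maximum = 18, 33, 48, 95

cv = std / mean                  # coefficient of variation
variance = std ** 2              # variance is the squared standard deviation
value_range = maximum - minimum  # range
iqr = q3 - q1                    # interquartile range

print(f"CV={cv:.8f} variance={variance:.5f} range={value_range} IQR={iqr}")
```

The results agree with the reported CV, variance, range and IQR up to rounding of the inputs.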
Histogram with fixed size bins (bins=50)
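A histogram with fixed size bins splits the interval from the minimum to the maximum into equally wide bins and counts the values in each. The following is a minimal sketch of that idea in plain Python; it is not the plotting code behind the report, and the function name is an assumption.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equally wide intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    counts = [0] * bins
    if hi == lo:
        # Degenerate case: all values are identical, so use the first bin.
        counts[0] = len(values)
        return counts
    width = (hi - lo) / bins
    for v in values:
        # The maximum would land one past the last bin, so clamp it.
        idx = min(int((v - lo) / width), bins - 1)
        counts[idx] += 1
    return counts


print(fixed_width_histogram([18, 19, 50, 95], bins=2))
```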
ValueCountFrequency (%)
32 2085
 
4.6%
31 1996
 
4.4%
33 1972
 
4.4%
34 1930
 
4.3%
35 1894
 
4.2%
36 1806
 
4.0%
30 1757
 
3.9%
37 1696
 
3.8%
39 1487
 
3.3%
38 1466
 
3.2%
Other values (67) 27122
60.0%
ValueCountFrequency (%)
18 12
 
< 0.1%
19 35
 
0.1%
20 50
 
0.1%
21 79
 
0.2%
22 129
 
0.3%
23 202
 
0.4%
24 302
 
0.7%
25 527
1.2%
26 805
1.8%
27 909
2.0%
ValueCountFrequency (%)
95 2
 
< 0.1%
94 1
 
< 0.1%
93 2
 
< 0.1%
92 2
 
< 0.1%
90 2
 
< 0.1%
89 3
 
< 0.1%
88 2
 
< 0.1%
87 4
< 0.1%
86 9
< 0.1%
85 5
< 0.1%

marital
Categorical

Distinct: 3
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 353.3 KiB
1
27214 
0
12790 
2
5207 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 45211
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
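As an illustration of these character properties, Python's standard `unicodedata` module exposes the general category of each code point. The sketch below counts categories for a string of digit codes like the ones in this column; `Nd` is the abbreviation for Decimal Number. Script and block lookups are not part of the standard library, so only the category and a simple ASCII check are shown, and the helper names are assumptions.

```python
import unicodedata
from collections import Counter


def category_counts(text):
    """Count the Unicode general category of every character in `text`."""
    return Counter(unicodedata.category(ch) for ch in text)


def all_ascii(text):
    """True if every character lies in the ASCII block (code point < 128)."""
    return all(ord(ch) < 128 for ch in text)


print(category_counts("1012"))
print(all_ascii("1012"))
```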

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 0
3rd row: 1
4th row: 1
5th row: 0

Common Values

ValueCountFrequency (%)
1 27214
60.2%
0 12790
28.3%
2 5207
 
11.5%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 27214
60.2%
0 12790
28.3%
2 5207
 
11.5%

Most occurring characters

ValueCountFrequency (%)
1 27214
60.2%
0 12790
28.3%
2 5207
 
11.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 45211
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 27214
60.2%
0 12790
28.3%
2 5207
 
11.5%

Most occurring scripts

ValueCountFrequency (%)
Common 45211
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 27214
60.2%
0 12790
28.3%
2 5207
 
11.5%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 27214
60.2%
0 12790
28.3%
2 5207
 
11.5%

education
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 353.3 KiB
2
23202 
3
13301 
1
6851 
0
 
1857

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 45211
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 3
2nd row: 2
3rd row: 2
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
2 23202
51.3%
3 13301
29.4%
1 6851
 
15.2%
0 1857
 
4.1%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
2 23202
51.3%
3 13301
29.4%
1 6851
 
15.2%
0 1857
 
4.1%

Most occurring characters

ValueCountFrequency (%)
2 23202
51.3%
3 13301
29.4%
1 6851
 
15.2%
0 1857
 
4.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 45211
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 23202
51.3%
3 13301
29.4%
1 6851
 
15.2%
0 1857
 
4.1%

Most occurring scripts

ValueCountFrequency (%)
Common 45211
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
2 23202
51.3%
3 13301
29.4%
1 6851
 
15.2%
0 1857
 
4.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 23202
51.3%
3 13301
29.4%
1 6851
 
15.2%
0 1857
 
4.1%

default
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 353.3 KiB
0
44396 
1
 
815

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 45211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 44396
98.2%
1 815
 
1.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 44396
98.2%
1 815
 
1.8%

Most occurring characters

ValueCountFrequency (%)
0 44396
98.2%
1 815
 
1.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 45211
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 44396
98.2%
1 815
 
1.8%

Most occurring scripts

ValueCountFrequency (%)
Common 45211
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 44396
98.2%
1 815
 
1.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 44396
98.2%
1 815
 
1.8%

balance
Real number (ℝ)

Distinct: 7168
Distinct (%): 15.9%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 1362.2721
Minimum: -8019
Maximum: 102127
Zeros: 3514
Zeros (%): 7.8%
Negative: 3766
Negative (%): 8.3%
Memory size: 353.3 KiB

Quantile statistics

Minimum: -8019
5-th percentile: -172
Q1: 72
median: 448
Q3: 1428
95-th percentile: 5768
Maximum: 102127
Range: 110146
Interquartile range (IQR): 1356

Descriptive statistics

Standard deviation: 3044.7658
Coefficient of variation (CV): 2.2350644
Kurtosis: 140.75155
Mean: 1362.2721
Median Absolute Deviation (MAD): 448
Skewness: 8.3603083
Sum: 61589682
Variance: 9270599
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
0 3514
 
7.8%
1 195
 
0.4%
2 156
 
0.3%
4 139
 
0.3%
3 134
 
0.3%
5 113
 
0.2%
6 88
 
0.2%
8 81
 
0.2%
23 75
 
0.2%
7 69
 
0.2%
Other values (7158) 40647
89.9%
ValueCountFrequency (%)
-8019 1
< 0.1%
-6847 1
< 0.1%
-4057 1
< 0.1%
-3372 1
< 0.1%
-3313 1
< 0.1%
-3058 1
< 0.1%
-2827 1
< 0.1%
-2712 1
< 0.1%
-2604 1
< 0.1%
-2282 1
< 0.1%
ValueCountFrequency (%)
102127 1
< 0.1%
98417 1
< 0.1%
81204 2
< 0.1%
71188 1
< 0.1%
66721 1
< 0.1%
66653 1
< 0.1%
64343 1
< 0.1%
59649 1
< 0.1%
58932 1
< 0.1%
58544 1
< 0.1%

housing
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 353.3 KiB
1
25130 
0
20081 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 45211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 1
2nd row: 1
3rd row: 1
4th row: 1
5th row: 0

Common Values

ValueCountFrequency (%)
1 25130
55.6%
0 20081
44.4%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 25130
55.6%
0 20081
44.4%

Most occurring characters

ValueCountFrequency (%)
1 25130
55.6%
0 20081
44.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 45211
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 25130
55.6%
0 20081
44.4%

Most occurring scripts

ValueCountFrequency (%)
Common 45211
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 25130
55.6%
0 20081
44.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 25130
55.6%
0 20081
44.4%

loan
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 353.3 KiB
0
37967 
1
7244 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 45211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 1
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 37967
84.0%
1 7244
 
16.0%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 37967
84.0%
1 7244
 
16.0%

Most occurring characters

ValueCountFrequency (%)
0 37967
84.0%
1 7244
 
16.0%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 45211
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 37967
84.0%
1 7244
 
16.0%

Most occurring scripts

ValueCountFrequency (%)
Common 45211
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 37967
84.0%
1 7244
 
16.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 37967
84.0%
1 7244
 
16.0%

contact
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 353.3 KiB
1
32191 
0
13020 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 45211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
1 32191
71.2%
0 13020
28.8%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
1 32191
71.2%
0 13020
28.8%

Most occurring characters

ValueCountFrequency (%)
1 32191
71.2%
0 13020
28.8%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 45211
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 32191
71.2%
0 13020
28.8%

Most occurring scripts

ValueCountFrequency (%)
Common 45211
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
1 32191
71.2%
0 13020
28.8%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 32191
71.2%
0 13020
28.8%

day
Real number (ℝ)

Distinct: 31
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 15.806419
Minimum: 1
Maximum: 31
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 353.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 3
Q1: 8
median: 16
Q3: 21
95-th percentile: 29
Maximum: 31
Range: 30
Interquartile range (IQR): 13

Descriptive statistics

Standard deviation: 8.3224762
Coefficient of variation (CV): 0.52652509
Kurtosis: -1.0598974
Mean: 15.806419
Median Absolute Deviation (MAD): 7
Skewness: 0.093079014
Sum: 714624
Variance: 69.263609
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=31)
ValueCountFrequency (%)
20 2752
 
6.1%
18 2308
 
5.1%
21 2026
 
4.5%
17 1939
 
4.3%
6 1932
 
4.3%
5 1910
 
4.2%
14 1848
 
4.1%
8 1842
 
4.1%
28 1830
 
4.0%
7 1817
 
4.0%
Other values (21) 25007
55.3%
ValueCountFrequency (%)
1 322
 
0.7%
2 1293
2.9%
3 1079
2.4%
4 1445
3.2%
5 1910
4.2%
6 1932
4.3%
7 1817
4.0%
8 1842
4.1%
9 1561
3.5%
10 524
 
1.2%
ValueCountFrequency (%)
31 643
 
1.4%
30 1566
3.5%
29 1745
3.9%
28 1830
4.0%
27 1121
2.5%
26 1035
2.3%
25 840
1.9%
24 447
 
1.0%
23 939
2.1%
22 905
2.0%

month
Real number (ℝ)

Distinct: 12
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 6.1446551
Minimum: 1
Maximum: 12
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 353.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 2
Q1: 5
median: 6
Q3: 8
95-th percentile: 11
Maximum: 12
Range: 11
Interquartile range (IQR): 3

Descriptive statistics

Standard deviation: 2.408034
Coefficient of variation (CV): 0.39189083
Kurtosis: 0.048579
Mean: 6.1446551
Median Absolute Deviation (MAD): 1
Skewness: 0.24284195
Sum: 277806
Variance: 5.7986276
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=12)
ValueCountFrequency (%)
5 13766
30.4%
7 6895
15.3%
8 6247
13.8%
6 5341
 
11.8%
11 3970
 
8.8%
4 2932
 
6.5%
2 2649
 
5.9%
1 1403
 
3.1%
10 738
 
1.6%
9 579
 
1.3%
Other values (2) 691
 
1.5%
ValueCountFrequency (%)
1 1403
 
3.1%
2 2649
 
5.9%
3 477
 
1.1%
4 2932
 
6.5%
5 13766
30.4%
6 5341
 
11.8%
7 6895
15.3%
8 6247
13.8%
9 579
 
1.3%
10 738
 
1.6%
ValueCountFrequency (%)
12 214
 
0.5%
11 3970
 
8.8%
10 738
 
1.6%
9 579
 
1.3%
8 6247
13.8%
7 6895
15.3%
6 5341
 
11.8%
5 13766
30.4%
4 2932
 
6.5%
3 477
 
1.1%

duration
Real number (ℝ)

Distinct: 1573
Distinct (%): 3.5%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 258.16308
Minimum: 0
Maximum: 4918
Zeros: 3
Zeros (%): < 0.1%
Negative: 0
Negative (%): 0.0%
Memory size: 353.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 35
Q1: 103
median: 180
Q3: 319
95-th percentile: 751
Maximum: 4918
Range: 4918
Interquartile range (IQR): 216

Descriptive statistics

Standard deviation: 257.52781
Coefficient of variation (CV): 0.99753928
Kurtosis: 18.153915
Mean: 258.16308
Median Absolute Deviation (MAD): 93
Skewness: 3.1443181
Sum: 11671811
Variance: 66320.574
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
124 188
 
0.4%
90 184
 
0.4%
89 177
 
0.4%
104 175
 
0.4%
122 175
 
0.4%
114 175
 
0.4%
136 174
 
0.4%
139 174
 
0.4%
112 174
 
0.4%
121 173
 
0.4%
Other values (1563) 43442
96.1%
ValueCountFrequency (%)
0 3
 
< 0.1%
1 2
 
< 0.1%
2 3
 
< 0.1%
3 4
 
< 0.1%
4 15
 
< 0.1%
5 35
0.1%
6 45
0.1%
7 73
0.2%
8 85
0.2%
9 77
0.2%
ValueCountFrequency (%)
4918 1
< 0.1%
3881 1
< 0.1%
3785 1
< 0.1%
3422 1
< 0.1%
3366 1
< 0.1%
3322 1
< 0.1%
3284 1
< 0.1%
3253 1
< 0.1%
3183 1
< 0.1%
3102 1
< 0.1%

campaign
Real number (ℝ)

Distinct: 48
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 2.7638407
Minimum: 1
Maximum: 63
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 353.3 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 2
Q3: 3
95-th percentile: 8
Maximum: 63
Range: 62
Interquartile range (IQR): 2

Descriptive statistics

Standard deviation: 3.0980209
Coefficient of variation (CV): 1.1209115
Kurtosis: 39.249651
Mean: 2.7638407
Median Absolute Deviation (MAD): 1
Skewness: 4.8986502
Sum: 124956
Variance: 9.5977334
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=48)
ValueCountFrequency (%)
1 17544
38.8%
2 12505
27.7%
3 5521
 
12.2%
4 3522
 
7.8%
5 1764
 
3.9%
6 1291
 
2.9%
7 735
 
1.6%
8 540
 
1.2%
9 327
 
0.7%
10 266
 
0.6%
Other values (38) 1196
 
2.6%
ValueCountFrequency (%)
1 17544
38.8%
2 12505
27.7%
3 5521
 
12.2%
4 3522
 
7.8%
5 1764
 
3.9%
6 1291
 
2.9%
7 735
 
1.6%
8 540
 
1.2%
9 327
 
0.7%
10 266
 
0.6%
ValueCountFrequency (%)
63 1
 
< 0.1%
58 1
 
< 0.1%
55 1
 
< 0.1%
51 1
 
< 0.1%
50 2
< 0.1%
46 1
 
< 0.1%
44 1
 
< 0.1%
43 3
< 0.1%
41 2
< 0.1%
39 1
 
< 0.1%

pdays
Real number (ℝ)

Distinct: 559
Distinct (%): 1.2%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 40.197828
Minimum: -1
Maximum: 871
Zeros: 0
Zeros (%): 0.0%
Negative: 36954
Negative (%): 81.7%
Memory size: 353.3 KiB

Quantile statistics

Minimum: -1
5-th percentile: -1
Q1: -1
median: -1
Q3: -1
95-th percentile: 317
Maximum: 871
Range: 872
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 100.12875
Coefficient of variation (CV): 2.4908994
Kurtosis: 6.9351952
Mean: 40.197828
Median Absolute Deviation (MAD): 0
Skewness: 2.6157155
Sum: 1817384
Variance: 10025.766
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
ValueCountFrequency (%)
-1 36954
81.7%
182 167
 
0.4%
92 147
 
0.3%
91 126
 
0.3%
183 126
 
0.3%
181 117
 
0.3%
370 99
 
0.2%
184 85
 
0.2%
364 77
 
0.2%
95 74
 
0.2%
Other values (549) 7239
 
16.0%
ValueCountFrequency (%)
-1 36954
81.7%
1 15
 
< 0.1%
2 37
 
0.1%
3 1
 
< 0.1%
4 2
 
< 0.1%
5 11
 
< 0.1%
6 10
 
< 0.1%
7 7
 
< 0.1%
8 25
 
0.1%
9 12
 
< 0.1%
ValueCountFrequency (%)
871 1
< 0.1%
854 1
< 0.1%
850 1
< 0.1%
842 1
< 0.1%
838 1
< 0.1%
831 1
< 0.1%
828 1
< 0.1%
826 1
< 0.1%
808 1
< 0.1%
805 1
< 0.1%

previous
Real number (ℝ)

HIGH CORRELATION
SKEWED
ZEROS

Distinct: 41
Distinct (%): 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 0.58032337
Minimum: 0
Maximum: 275
Zeros: 36954
Zeros (%): 81.7%
Negative: 0
Negative (%): 0.0%
Memory size: 353.3 KiB

Quantile statistics

Minimum: 0
5-th percentile: 0
Q1: 0
median: 0
Q3: 0
95-th percentile: 3
Maximum: 275
Range: 275
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation: 2.303441
Coefficient of variation (CV): 3.9692371
Kurtosis: 4506.8607
Mean: 0.58032337
Median Absolute Deviation (MAD): 0
Skewness: 41.846454
Sum: 26237
Variance: 5.3058406
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=41)
ValueCountFrequency (%)
0 36954
81.7%
1 2772
 
6.1%
2 2106
 
4.7%
3 1142
 
2.5%
4 714
 
1.6%
5 459
 
1.0%
6 277
 
0.6%
7 205
 
0.5%
8 129
 
0.3%
9 92
 
0.2%
Other values (31) 361
 
0.8%
ValueCountFrequency (%)
0 36954
81.7%
1 2772
 
6.1%
2 2106
 
4.7%
3 1142
 
2.5%
4 714
 
1.6%
5 459
 
1.0%
6 277
 
0.6%
7 205
 
0.5%
8 129
 
0.3%
9 92
 
0.2%
ValueCountFrequency (%)
275 1
< 0.1%
58 1
< 0.1%
55 1
< 0.1%
51 1
< 0.1%
41 1
< 0.1%
40 1
< 0.1%
38 2
< 0.1%
37 2
< 0.1%
35 1
< 0.1%
32 1
< 0.1%

poutcome
Categorical

Distinct: 4
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 353.3 KiB
0
36959 
2
4901 
3
 
1840
1
 
1511

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 45211
Distinct characters: 4
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 36959
81.7%
2 4901
 
10.8%
3 1840
 
4.1%
1 1511
 
3.3%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 36959
81.7%
2 4901
 
10.8%
3 1840
 
4.1%
1 1511
 
3.3%

Most occurring characters

ValueCountFrequency (%)
0 36959
81.7%
2 4901
 
10.8%
3 1840
 
4.1%
1 1511
 
3.3%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 45211
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 36959
81.7%
2 4901
 
10.8%
3 1840
 
4.1%
1 1511
 
3.3%

Most occurring scripts

ValueCountFrequency (%)
Common 45211
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 36959
81.7%
2 4901
 
10.8%
3 1840
 
4.1%
1 1511
 
3.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 36959
81.7%
2 4901
 
10.8%
3 1840
 
4.1%
1 1511
 
3.3%

Target
Categorical

Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 353.3 KiB
0
39922 
1
5289 

Length

Max length: 1
Median length: 1
Mean length: 1
Min length: 1

Characters and Unicode

Total characters: 45211
Distinct characters: 2
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: 0
2nd row: 0
3rd row: 0
4th row: 0
5th row: 0

Common Values

ValueCountFrequency (%)
0 39922
88.3%
1 5289
 
11.7%

Length

Histogram of lengths of the category

Common Values (Plot)

ValueCountFrequency (%)
0 39922
88.3%
1 5289
 
11.7%

Most occurring characters

ValueCountFrequency (%)
0 39922
88.3%
1 5289
 
11.7%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 45211
100.0%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 39922
88.3%
1 5289
 
11.7%

Most occurring scripts

ValueCountFrequency (%)
Common 45211
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
0 39922
88.3%
1 5289
 
11.7%

Most occurring blocks

ValueCountFrequency (%)
ASCII 45211
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 39922
88.3%
1 5289
 
11.7%

Interactions


Correlations


Auto

The auto setting chooses an interpretable pairwise metric for each pair of columns according to their variable types:
  • Variable_type-Variable_type : Method, Range
  • Categorical-Categorical : Cramer's V, [0,1]
  • Numerical-Categorical : Cramer's V, [0,1] (using a discretized numerical column)
  • Numerical-Numerical : Spearman's ρ, [-1,1]
The number of bins used in the discretization for the Numerical-Categorical column pair can be changed using config.correlations["auto"].n_bins. The number of bins affects the granularity of the association you wish to measure.

This configuration uses the recommended metric for each pair of columns.
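The mapping above can be sketched as a small dispatch function, together with a simple discretization step for numerical columns. This is an illustration only: `auto_method` and `discretize` are hypothetical names, and equal-width binning is an assumption, since the report does not say which binning strategy is applied.

```python
def auto_method(type_a, type_b):
    """Return (method name, value range) for a pair of column types."""
    kinds = {"Numerical", "Categorical"}
    if type_a not in kinds or type_b not in kinds:
        raise ValueError("types must be 'Numerical' or 'Categorical'")
    if type_a == type_b == "Numerical":
        return "Spearman's rho", (-1.0, 1.0)
    # Any pair involving a categorical column falls back to Cramer's V.
    return "Cramer's V", (0.0, 1.0)


def discretize(values, n_bins):
    """Map numeric values to equal-width bin indices 0 .. n_bins - 1 (assumed strategy)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0] * len(values)
    width = (hi - lo) / n_bins
    # The maximum is clamped into the last bin.
    return [min(int((v - lo) / width), n_bins - 1) for v in values]


print(auto_method("Numerical", "Categorical"))
print(discretize([0, 5, 10], n_bins=2))
```

A larger `n_bins` gives a finer-grained view of the numerical column at the cost of sparser cells in the contingency table.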

Spearman's ρ

Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better than Pearson's r at catching nonlinear monotonic correlations. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation.

To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
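The recipe above can be sketched in plain Python: rank both variables (ties receive the average of their ranks), then compute the covariance of the ranks divided by the product of their standard deviations. The helper names are illustrative and the data is a toy example.

```python
from statistics import mean


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Covariance divided by the product of the standard deviations."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)


def spearman(x, y):
    """Spearman's rho: Pearson's r applied to the rank variables."""
    return pearson(average_ranks(x), average_ranks(y))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic but nonlinear in x
print(spearman(x, y), pearson(x, y))
```

Because `y` increases with `x` in every step, ρ is (up to floating-point error) 1, while Pearson's r stays below 1 for the same data.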

Pearson's r

Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and (positive) scale of the two variables, which implies that, for a linear relationship, the steepness of the line does not affect r.

To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
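A minimal sketch of this calculation, which also demonstrates the invariance under changes of location and positive scale. The data points are made up for the illustration.

```python
def pearson(x, y):
    """Covariance of x and y divided by the product of their standard deviations."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sx = (sum((a - mx) ** 2 for a in x) / n) ** 0.5
    sy = (sum((b - my) ** 2 for b in y) / n) ** 0.5
    return cov / (sx * sy)


x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]

r = pearson(x, y)
# Shifting and rescaling each variable separately leaves r unchanged.
r_shifted = pearson([10 * a + 3 for a in x], [0.5 * b - 7 for b in y])
# Multiplying one variable by a negative factor flips the sign of r.
r_flipped = pearson([-a for a in x], y)

print(r, r_shifted, r_flipped)
```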

Kendall's τ

Like Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation.

To calculate τ for two variables X and Y, one counts the concordant and discordant pairs of observations. τ is the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
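The pair-counting definition can be written down directly. The sketch below implements exactly that formula, often called tau-a; library implementations commonly use a tie-adjusted variant (tau-b), so results can differ when the data contains ties. The function name is an assumption.

```python
from itertools import combinations


def kendall_tau_a(x, y):
    """(concordant - discordant) / total number of pairs."""
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        s = (x1 - x2) * (y1 - y2)
        if s > 0:
            concordant += 1   # both variables move in the same direction
        elif s < 0:
            discordant += 1   # the variables move in opposite directions
        # s == 0 means a tie in x or y; it counts toward neither group.
    total = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / total


print(kendall_tau_a([1, 2, 3], [1, 3, 2]))
```

In the example, two of the three pairs are concordant and one is discordant, giving τ = (2 - 1) / 3.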

Cramér's V (φc)

Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. We use the bias-corrected measure proposed by Bergsma in 2013.
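For illustration, a bias-corrected Cramér's V in the style of Bergsma's correction can be sketched in plain Python as follows. The sketch computes the chi-squared statistic from the contingency table, subtracts the expected bias from φ², and shrinks the row and column counts accordingly. It is not taken from the report's source code, and the function name is an assumption.

```python
from collections import Counter
from math import sqrt


def cramers_v_corrected(a, b):
    """Bias-corrected Cramer's V for two equally long categorical sequences."""
    n = len(a)
    joint = Counter(zip(a, b))
    row, col = Counter(a), Counter(b)

    # Pearson chi-squared statistic over all cells of the contingency table.
    chi2 = 0.0
    for ra, na in row.items():
        for cb, nb in col.items():
            expected = na * nb / n
            observed = joint.get((ra, cb), 0)
            chi2 += (observed - expected) ** 2 / expected

    phi2 = chi2 / n
    r, k = len(row), len(col)
    # Bias correction of phi^2 and of the table dimensions.
    phi2_corr = max(0.0, phi2 - (k - 1) * (r - 1) / (n - 1))
    r_corr = r - (r - 1) ** 2 / (n - 1)
    k_corr = k - (k - 1) ** 2 / (n - 1)
    denom = min(r_corr - 1, k_corr - 1)
    if denom <= 0:
        return 0.0  # a constant column carries no association
    return sqrt(phi2_corr / denom)


same = ["x", "y"] * 50
print(cramers_v_corrected(same, same))
print(cramers_v_corrected(["x", "x", "y", "y"] * 25, ["p", "q", "p", "q"] * 25))
```

The first call compares a column with itself (perfect association), the second compares two columns whose categories are evenly crossed (independence).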

Phik (φk)

Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency and reverts to the Pearson correlation coefficient in case of a bivariate normal input distribution.

Missing values

A simple visualization of nullity by column.
The nullity matrix is a data-dense display that lets you quickly pick out patterns in data completion.
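The per-column nullity behind such a chart is just a count of missing entries in each column. The sketch below shows the idea on a made-up toy table with some missing values (the profiled dataset itself has none); rows are plain dictionaries, `None` marks a missing cell, and the function name is an assumption.

```python
def missing_per_column(rows):
    """Count missing (None) cells per column for a list of row dictionaries."""
    columns = list(rows[0].keys())
    return {col: sum(1 for row in rows if row.get(col) is None) for col in columns}


# Toy data for illustration only.
toy_rows = [
    {"age": 58, "balance": 2143},
    {"age": None, "balance": 29},
    {"age": 33, "balance": None},
    {"age": 47, "balance": None},
]
counts = missing_per_column(toy_rows)
print(counts)
```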

Sample

index  age  marital  education  default  balance  housing  loan  contact  day  month  duration  campaign  pdays  previous  poutcome  Target
0       58        1          3        0     2143        1     0        0    5      5       261         1     -1         0         0       0
1       44        0          2        0       29        1     0        0    5      5       151         1     -1         0         0       0
2       33        1          2        0        2        1     1        0    5      5        76         1     -1         0         0       0
3       47        1          0        0     1506        1     0        0    5      5        92         1     -1         0         0       0
4       33        0          0        0        1        0     0        0    5      5       198         1     -1         0         0       0
5       35        1          3        0      231        1     0        0    5      5       139         1     -1         0         0       0
6       28        0          3        0      447        1     1        0    5      5       217         1     -1         0         0       0
7       42        2          3        1        2        1     0        0    5      5       380         1     -1         0         0       0
8       58        1          1        0      121        1     0        0    5      5        50         1     -1         0         0       0
9       43        0          2        0      593        1     0        0    5      5        55         1     -1         0         0       0

index  age  marital  education  default  balance  housing  loan  contact  day  month  duration  campaign  pdays  previous  poutcome  Target
45201   53        1          3        0      583        0     0        1   17     11       226         1    184         4         1       1
45202   34        0          2        0      557        0     0        1   17     11       224         1     -1         0         0       1
45203   23        0          3        0      113        0     0        1   17     11       266         1     -1         0         0       1
45204   73        1          2        0     2850        0     0        1   17     11       300         1     40         8         2       1
45205   25        0          2        0      505        0     1        1   17     11       386         2     -1         0         0       1
45206   51        1          3        0      825        0     0        1   17     11       977         3     -1         0         0       1
45207   71        2          1        0     1729        0     0        1   17     11       456         2     -1         0         0       1
45208   72        1          2        0     5715        0     0        1   17     11      1127         5    184         3         1       1
45209   57        1          2        0      668        0     0        1   17     11       508         4     -1         0         0       0
45210   37        1          2        0     2971        0     0        1   17     11       361         2    188        11         3       0